THE TOTAL PATH LENGTH OF SPLIT TREES

By Nicolas Broutin and Cecilia Holmgren 
Inria, and Inria and Cambridge University 

We consider the model of random trees introduced by Devroye 
[SIAM J. Comput. 28 (1999) 409-432]. The model encompasses many 
important randomized algorithms and data structures. The pieces of 
data (items) are stored in a randomized fashion in the nodes of a tree. 
The total path length (sum of depths of the items) is a natural mea- 
sure of the efficiency of the algorithm/data structure. Using renewal 
theory, we prove convergence in distribution of the total path length 
toward a distribution characterized uniquely by a fixed point equa- 
tion. Our result covers, using a unified approach, many data struc- 
tures such as binary search trees, m-ary search trees, quad trees, 
median-of-(2k + 1) trees, and simplex trees.

1. Introduction. In this paper we investigate the total path length, that 
is, sum of all depths, of random split trees defined by Devroye [13] (we will 
be more precise shortly). Split trees model a large class of efficient data 
structures or sorting algorithms. Some important examples of split trees are 
binary search trees (which are also the representation of Quicksort) [24], 
m-ary search trees [47], quad trees [19], median-of-(2k + 1) trees [4], simplex
trees; all these are covered by the results in this document. The case of 
tries [21] and digital search trees [12] is also important in practice [54]; 
however, their treatment necessitates different tools, and we leave this case 
for later studies. 

The magnitude of the depths in tree data structures naturally influences 
their efficiency; in the case where the tree represents the branching choices 
made by an algorithm, the depths are related to the running time of the 
algorithm. In this sense, the sum of the depths is a natural and important 
measure of the efficiency of tree data structures or sorting algorithms. 

The path length of tree data structures has been studied by many authors,
but in most cases the analyses and proofs are very much tied to a specific
case. The main result of this study is to prove that for a large class of split
trees, the total path length converges in distribution to a random variable
characterized by some fixed point equation. In that sense our result extends
the earlier studies of Rösler [50, 52] and Neininger and Rüschendorf [45] who
used the so-called contraction method to show convergence in distribution
of the total path length for the specific examples of the binary search trees,
the median-of-(2k + 1) trees and quad trees. Our method actually relies on
previous work of Neininger and Rüschendorf [45] who gave a limit theorem
for the path length of general split trees, under the assumption that the
mean satisfies some precise asymptotic form, which we prove.

Plan of the paper. In Section 2, we introduce the model of split trees 
of Devroye [13]. We also discuss previous work on the path length and similar 
topics. This is also the place where we state our main result, Theorem 2.1.

In Section 3, we explain our general approach, which relies heavily on 
previous work by Neininger and Rüschendorf [45]. These authors stated
a general condition for convergence in distribution of the path length, and 
our contribution is to prove that it indeed holds for a large class of split 
trees. So Section 3 is included so that the reader has a general view of the 
argument. 

Once we have stated the precise condition in Section 3, we will move on 
to explaining our approach to proving it in Section 4. Finally, in Section 5 
we discuss extensions of our results. 

2. Split trees and path length: Notation and background. We introduce 
the split tree model of Devroye [13]. Consider an infinite rooted $b$-ary tree
(every node has $b$ children). The nodes are identified with the set of finite
words on an alphabet with $b$ letters, $\mathcal{U} = \bigcup_{n \ge 0} \{1, \dots, b\}^n$. The root is
represented by the empty word $\emptyset$. We write $u \preceq v$ to denote that $u$ is an ancestor
of $v$ (as words, $u$ is a prefix of $v$). In particular, for the empty word $\emptyset$, we
have $\emptyset \preceq v$ for any $v \in \mathcal{U}$.

A split tree $T^n$ of cardinality $n$ is constructed by distributing $n$ items
(pieces of data) to the nodes $u \in \mathcal{U}$. To describe the tree, it suffices to define
the number of items $n_u$ in the subtree rooted at any node $u \in \mathcal{U}$. The tree $T^n$
is then defined as the smallest relevant tree, that is, the subset of nodes $u$
such that $n_u > 0$ (which is indeed a tree).

In the model, internal nodes all contain $s_0 \ge 0$ items, and external nodes
can contain up to $s$ items. The construction then resembles a divide-and-conquer
procedure, where the partitioning pattern depends on a random
vector of proportions. Let $V = (V_1, \dots, V_b)$ satisfy $V_i \ge 0$ and $\sum_i V_i = 1$; each
node $u \in \mathcal{U}$ receives an independent copy $V_u$ of the random vector $V$. In the
following, we always assume that $P(\exists i : V_i = 1) < 1$. We can now describe
$(n_u, u \in \mathcal{U})$. The tree contains $n$ items, and we naturally have $n_\emptyset = n$. The split
procedure is then carried on from parent to children as long as $n_u > s$.
Given the cardinality $n_v$ and the split vector $V_v = (V_1, V_2, \dots, V_b)$ of $v$, the
cardinalities $(n_{v_1}, n_{v_2}, \dots, n_{v_b})$ of the $b$ subtrees rooted at $v_1, v_2, \dots, v_b$ are
distributed as

(1) $\mathrm{Mult}(n_v - s_0 - b s_1, V_1, V_2, \dots, V_b) + (s_1, s_1, \dots, s_1)$,

where $0 < s$ and $0 \le b s_1 \le s + 1 - s_0$.

Depending on the choice of parameters $s_0, s_1, s$ and the distribution of
$V = (V_1, \dots, V_b)$, many important data structures may be modeled, such as
binary search trees, m-ary search trees, median-of-(2k + 1) trees, quad trees,
simplex trees (see [13]). To make sure that the model is clear and to give 
a hint of the wide applicability of the model, we illustrate the construction 
with two canonical examples. 

Example 1 (Binary search tree). The binary search tree is one of the 
most common data structures for sorted data. Here we assume that the data 
set is $\{1, \dots, n\}$. A first (uniformly) random key $\sigma_1$ is drawn, and stored
at the root of a binary tree. The remaining keys are then divided into two
subgroups, depending on whether they are smaller or larger than $\sigma_1$. The left
and right subtrees are then binary search trees built from the two subgroups
$\{i : i < \sigma_1\}$ and $\{i : i > \sigma_1\}$, respectively. The sizes of the two subtrees of the
root are $\sigma_1 - 1$ and $n - \sigma_1$. One easily verifies that, since $\sigma_1$ is uniform in
$\{1, 2, \dots, n\}$, one has

$(\sigma_1 - 1, n - \sigma_1) \overset{d}{=} \mathrm{Mult}(n - 1; U, 1 - U)$,

where $U$ is a uniform $U(0, 1)$ random variable. Thus, a binary search tree
can be described as a split tree with parameters $b = 2$, $s_0 = 1$, $s = 1$, $s_1 = 0$
and $V$ is distributed as $(U, 1 - U)$ for $U$ a random variable uniform on $[0, 1]$.
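The divide-and-conquer description can be turned into a small simulation. The sketch below is only an illustration under stated assumptions: the function names (`bst_split`, `child_sizes`, `sample_path_length`) are hypothetical, the multinomial in (1) is sampled naively with the standard library, and the path length is accumulated using the observation that every item not kept at an internal node lies one level deeper, so that a node of cardinality $m > s$ contributes $m - s_0$.

```python
import random


def bst_split(rng):
    """Split vector of Example 1: V = (U, 1 - U) with U uniform on [0, 1]."""
    u = rng.random()
    return [u, 1.0 - u]


def child_sizes(m, b, s0, s1, v, rng):
    """Sample the subtree cardinalities as in (1):
    Mult(m - s0 - b*s1; v) + (s1, ..., s1)."""
    sizes = [s1] * b
    free = m - s0 - b * s1  # items that choose a child independently
    if free > 0:
        for i in rng.choices(range(b), weights=v, k=free):
            sizes[i] += 1
    return sizes


def sample_path_length(n, b, s, s0, s1, sample_v, rng):
    """One sample of the total path length of a split tree with n items."""
    total = 0
    stack = [n]  # cardinalities of subtrees still to be split
    while stack:
        m = stack.pop()
        if m <= s:
            continue  # a leaf stores all its items; nothing lies deeper
        total += m - s0  # the m - s0 items sent down are one level deeper
        stack.extend(child_sizes(m, b, s0, s1, sample_v(rng), rng))
    return total
```

With the binary search tree parameters ($b = 2$, $s = s_0 = 1$, $s_1 = 0$), calling `sample_path_length(n, 2, 1, 1, 0, bst_split, random.Random(0))` returns one realization; averaging many realizations gives a Monte Carlo estimate of the mean path length.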

Example 2 (Digital trees or tries). We are given $n$ (infinite) strings $X_1,
\dots, X_n$ on the alphabet $\{1, \dots, b\}$. The strings are drawn independently,
and the symbols of every string are also independent with distribution on
$\{1, \dots, b\}$ given by $p_1, \dots, p_b$. Each string naturally corresponds to an infinite
path in the infinite complete $b$-ary tree, where the sequence of symbols indicates
the sequence of directions to take as one walks away from the root. The
trie is then defined as the smallest tree so that all the paths corresponding to
the infinite strings are eventually distinguished; that is, for every string $X_i$,
there exists a node $u$ in the tree such that $X_i$ is the only string with $u \preceq X_i$.
The internal nodes store no data; each leaf stores a unique string. In this
case, $n_v$ is the number of strings that have prefix $v$, and one clearly has for
the children of the root

$(n_1, \dots, n_b) \overset{d}{=} \mathrm{Mult}(n; p_1, \dots, p_b)$.

The trie is thus a random split tree with parameters $s = 1$, $s_0 = s_1 = 0$ and
$V = (p_1, p_2, \dots, p_b)$ almost surely.

An algorithmic point of view. Rather than using the divide-and-conquer
description above, the random trees may be equivalently defined
using incremental insertion of data items into an initially empty data structure.
The items are labeled using $\{1, 2, \dots, n\}$ in the order of insertion. Initially,
$n_u = 0$ for every $u \in \mathcal{U}$. We first sample the i.i.d. copies of $V$ that are
assigned to the nodes $u \in \mathcal{U}$.

• Upon insertion, an item first trickles down along a random path from the
root until it finds a leaf (i.e., a node $u$ such that all its children $u_1, \dots, u_b$
satisfy $n_{u_i} = 0$). If the path currently corresponds to a word $v \in \mathcal{U}$, and $v$
is not a leaf, then it is extended to $v_i$, the $i$th child of $v$, with probability $V_i$,
where $(V_1, \dots, V_b)$ is the copy of $V$ associated with $v$.

• When the first phase is finished, the item is stored in a leaf, say $v$. The
leaves can contain up to $s$ items. So if $n_v < s$ (before the insertion), then
the item is stored at $v$, and all the $n_u$ for $u \preceq v$ are updated.

• If $n_v = s$, there is no space for the new item at $v$. With the new item, we
formally have $n_v = s + 1$. In this case, $s_0$ of these $s + 1$ items are randomly
chosen to remain at $v$ while the other $s + 1 - s_0$ are distributed among
the children $v_1, \dots, v_b$ of $v$. Each child receives $s_1$ items chosen at random.
The remaining $s + 1 - s_0 - b s_1$ each choose (independently) a child $v_i$ at
random with probability $V_i$, where $(V_1, \dots, V_b)$ is the copy of $V$ at node $v$.
If $s_1 = s_0 = 0$, it may happen that all $s + 1$ items now lie at one child $v_i$,
in which case the scheme is repeated until a stable position is found. [This
happens with probability 1, since $P(\exists i : V_i = 1) < 1$.] This last step is the
reason why an item may move down when a further item is inserted.

The properties of the multinomial distribution ensure that the tree $T^n$
obtained in this way has the correct distribution (see [13] for details).
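The incremental insertion procedure described above can be sketched as follows. This is a minimal illustration, not an implementation taken from [13]: the class `Node` and the functions `insert`, `build` and `total_path_length` are hypothetical names, split vectors are sampled lazily when a node is created (rather than all in advance), and a node is treated as a leaf as long as it has never been split.

```python
import random


class Node:
    """A node of the split tree; `children` stays None while it is a leaf."""

    def __init__(self, split):
        self.items = []       # items stored at this node
        self.count = 0        # n_u: number of items in the subtree rooted here
        self.split = split    # this node's copy of V = (V_1, ..., V_b)
        self.children = None


def insert(root, item, b, s, s0, s1, sample_v, rng):
    # Phase 1: trickle down to a leaf; child i is chosen with probability V_i.
    node = root
    node.count += 1
    while node.children is not None:
        i = rng.choices(range(b), weights=node.split)[0]
        node = node.children[i]
        node.count += 1
    node.items.append(item)
    # Phase 2: a leaf holding s + 1 items is split.
    while len(node.items) > s:
        pool = node.items
        rng.shuffle(pool)
        node.items, pool = pool[:s0], pool[s0:]   # s0 items remain at the node
        node.children = [Node(sample_v(rng)) for _ in range(b)]
        for child in node.children:               # every child receives s1 items
            for _ in range(s1):
                child.items.append(pool.pop())
        for it in pool:                           # the rest choose a child by V
            i = rng.choices(range(b), weights=node.split)[0]
            node.children[i].items.append(it)
        for child in node.children:
            child.count = len(child.items)
        # Only when s0 = s1 = 0 can one child hold all s + 1 items; repeat there.
        node = max(node.children, key=lambda c: len(c.items))


def total_path_length(root):
    """Sum of the depths of all items (the root has depth 0)."""
    total, stack = 0, [(root, 0)]
    while stack:
        node, depth = stack.pop()
        total += depth * len(node.items)
        if node.children is not None:
            stack.extend((child, depth + 1) for child in node.children)
    return total


def bst_split(rng):
    u = rng.random()
    return [u, 1.0 - u]


def build(n, b, s, s0, s1, sample_v, seed=0):
    rng = random.Random(seed)
    root = Node(sample_v(rng))
    for item in range(1, n + 1):
        insert(root, item, b, s, s0, s1, sample_v, rng)
    return root
```

For example, `total_path_length(build(1000, 2, 1, 1, 0, bst_split))` gives one realization of the path length under the binary search tree parameters of Example 1.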

In the present case we can assume without loss of generality that the
components of the split vector are identically distributed; applying a random
permutation to the components would leave the path length unchanged. We now let $V$
denote a uniformly random component of the vector $(V_1, \dots, V_b)$. So, for instance,
$E[V] = 1/b$ and $P(V = 1) < 1/b$ by our assumption that $P(\exists i : V_i = 1) < 1$.

Background and previous work. The labeling of the items induced
by the algorithm above is interesting for the analysis. Let $D_i$ be the depth
of the item labeled $i$ when all $n$ items have been inserted. Then, the total
path length is

$\Psi(T^n) = \sum_{i=1}^{n} D_i$.

The analysis of the depth $D_n$ of the last item $n$ is thus tightly related to
the analysis of $\Psi(T^n)$, and yet is much simpler since it avoids the intricate
dependence between the $D_i$. Devroye [13] proved a weak law of large numbers
and a central limit theorem for $D_n$ in general split trees. Let $\Delta$ be
a component of $(V_1, \dots, V_b)$ picked with probability proportional to its size;
that is, given $(V_1, \dots, V_b)$, let $\Delta = V_j$ with probability $V_j$. We write

(2) $\mu := E[-\ln \Delta] = b E[-V \ln V]$ and $\sigma^2 := \mathrm{Var}(\ln \Delta) = b E[V \ln^2 V] - \mu^2$.

Note that $\mu \in (0, \infty)$ and $\sigma < \infty$. Then $D_n / \ln n$ converges in probability
to $\mu^{-1}$ and $E[D_n] / \ln n \to \mu^{-1}$ (Devroye assumed that $P(V = 1) = 0$, but
this assumption can be relaxed as long as $V$ satisfies $P(V = 1) < 1/b$; this is
done using trees in which edges are weighted by geometric random variables
(see, e.g., [6, 7])). If we also have $\sigma > 0$, then

$\frac{D_n - \mu^{-1} \ln n}{\sigma \mu^{-3/2} \sqrt{\ln n}} \to \mathcal{N}(0, 1)$

in distribution, where $\mathcal{N}(0, 1)$ denotes the standard Normal distribution.
Note that $\sigma > 0$ precisely when $V$ is not monoatomic, that is, if $bV \ne 1$
with positive probability.
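As a worked check of the constants in (2), consider the binary search tree of Example 1, where $b = 2$ and $V$ is uniform on $[0, 1]$. The computation below is an illustration rather than part of the original text; it uses only the elementary integrals $\int_0^1 v \ln v \, dv = -1/4$ and $\int_0^1 v \ln^2 v \, dv = 1/4$.

```latex
\begin{align*}
\mu &= b\,E[-V \ln V] = 2 \int_0^1 (-v \ln v)\,dv = 2 \cdot \tfrac14 = \tfrac12, \\
\sigma^2 &= b\,E[V \ln^2 V] - \mu^2 = 2 \int_0^1 v \ln^2 v\,dv - \tfrac14
          = 2 \cdot \tfrac14 - \tfrac14 = \tfrac14, \\
\mu^{-1} \ln n &= 2 \ln n, \qquad
\sigma \mu^{-3/2} \sqrt{\ln n} = \sqrt{2 \ln n}.
\end{align*}
```

So in this case the depth of the last item is centered at $2 \ln n$ with fluctuations of order $\sqrt{2 \ln n}$, in line with the classical results for random binary search trees.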

The total path length $\Psi(T^n)$ itself has been extensively studied for specific
cases of split trees. The first moment follows from that of $D_n$ since

$E[\Psi(T^n)] = \sum_{i=1}^{n} E[D_i]$.

For instance, in the binary search tree, we have [23]

(3) $E[\Psi^{BST}(T^n)] = 2 n \ln n + n(2\gamma - 4) + 2 \ln n + 2\gamma + 1 + O(n^{-1})$,

where $\gamma$ is Euler's constant. For higher moments and the distribution of $\Psi(T^n)$
one needs to carefully take the dependence in the terms of the sum into account.
Most studies of this type concern the model of binary search tree, or
equivalently the cost of quicksort (e.g., [18, 18, 49, 50, 55]). Let

(4) $Y_n := \frac{\Psi^{BST}(T^n) - E[\Psi^{BST}(T^n)]}{n}$.
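The expansion (3) can be checked numerically. The sketch below assumes the standard recurrence for the mean path length of a random binary search tree, $E_n = n - 1 + \frac{2}{n} \sum_{k=0}^{n-1} E_k$ with $E_0 = 0$, which follows from Example 1 (the root keeps one key, the other $n - 1$ keys are one level deeper, and the left subtree size is uniform on $\{0, \dots, n-1\}$). The recurrence and the function names are not taken from the text; they are assumptions made for illustration.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler's constant, hard-coded


def expected_bst_path_length(n_max):
    """Exact E[Psi(T^n)] for n = 0..n_max from
    E_n = n - 1 + (2/n) * (E_0 + ... + E_{n-1})."""
    e = [0.0] * (n_max + 1)
    prefix = 0.0  # running sum E_0 + ... + E_{n-1}
    for n in range(1, n_max + 1):
        e[n] = (n - 1) + 2.0 * prefix / n
        prefix += e[n]
    return e


def asymptotic(n):
    """Right-hand side of (3) without the O(1/n) term."""
    return (2 * n * math.log(n) + n * (2 * EULER_GAMMA - 4)
            + 2 * math.log(n) + 2 * EULER_GAMMA + 1)


exact = expected_bst_path_length(1000)
gap = exact[1000] - asymptotic(1000)  # should be of order 1/n
```

The difference `gap` shrinks like a constant times $1/n$, consistent with the $O(n^{-1})$ remainder in (3).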

Using martingale arguments, Regnier [49] showed that $Y_n$ converges in distribution
to a random variable $Y$. Rösler [50] showed that $Y$ satisfies
the following distributional equality:

(5) $Y \overset{d}{=} U Y + (1 - U) Y^* + C(U)$,

where $C(u) := 2u \ln u + 2(1 - u) \ln(1 - u) + 1$, $U$ is uniform on $[0, 1]$, $Y^* \overset{d}{=} Y$,
and $U$, $Y$ and $Y^*$ are independent. He also proved that the stochastic equality in (5)
actually characterizes the distribution of $Y$: there exists a unique solution $Y$
of (5) such that $E[Y] = 0$ and $\mathrm{Var}(Y) < \infty$. The distribution of $Y$ is usually
called the quicksort distribution. Properties of $Y$ and the rate of convergence
of $Y_n$ to $Y$ are studied in [17, 18, 50, 55].

The aim of the present study is to prove that the path length exhibits 
a similar asymptotic behavior regardless of the precise model of split tree: 

Theorem 2.1. Let $\Psi(T^n)$ be the total path length in a general split tree
with split vector $V = (V_1, \dots, V_b)$. Suppose that $P(\exists i : V_i = 1) < 1$. Let

$X_n := \frac{\Psi(T^n) - E[\Psi(T^n)]}{n}$ and $C(V) = 1 + \frac{1}{\mu} \sum_{i=1}^{b} V_i \ln V_i$.

If $C(V) \ne 0$ with positive probability, then $X_n \to X$ in distribution, where $X$
is the unique solution of the fixed point equation

$X \overset{d}{=} \sum_{k=1}^{b} V_k X^{(k)} + C(V)$,

satisfying $E[X] = 0$ and $\mathrm{Var}(X) < \infty$. Furthermore, exponential moments
of $X_n$ exist and converge: $E[e^{\lambda X_n}] \to E[e^{\lambda X}]$ for any $\lambda \in \mathbb{R}$.
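As a consistency check, one can specialize the toll term of Theorem 2.1 to the binary search tree of Example 1, where $b = 2$, $V = (U, 1 - U)$ with $U$ uniform on $[0, 1]$, and $\mu = 2 E[-U \ln U] = 1/2$. The short computation below is a sketch of this specialization, not part of the original argument; it recovers the toll term of the quicksort equation (5) and confirms that the toll is centered.

```latex
\begin{align*}
C(V) &= 1 + \frac{1}{\mu} \sum_{i=1}^{2} V_i \ln V_i
      = 1 + 2U \ln U + 2(1 - U) \ln(1 - U), \\
X &\overset{d}{=} U X^{(1)} + (1 - U) X^{(2)} + C(V), \\
E[C(V)] &= 1 + 4 \int_0^1 u \ln u \, du = 1 - 1 = 0.
\end{align*}
```

More generally $E[\sum_i V_i \ln V_i] = -\mu$ by (2), so $E[C(V)] = 0$ for any split vector, which is compatible with the normalization $E[X] = 0$.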

As mentioned in the Introduction, Neininger and Rüschendorf [45] proved
a version of Theorem 2.1 conditional on the type of asymptotic expansion for
$E[\Psi(T^n)]$; our contribution is to prove that this expansion indeed holds
(Theorem 3.1), which implies the unconditional version stated in Theorem 2.1.

We have recently been informed that, based on a Markov chain repre- 
sentation of Bruhn [9] and coupling arguments, Munsonius [44] has shown 
a result similar to our Theorem 2.1 in the special case when the distribution 
of V has a density with respect to Lebesgue measure. 

Discussion and remarks about the assumptions. (i) When the
split vector $V$ is deterministic, that is, $V$ is a permutation of some fixed
vector $(p_1, \dots, p_b)$, the cost function $C(V) = 0$. Such a split tree is a digital
tree [54]. In some sense, part of Theorem 2.1 still holds, but the limit $X$ is
trivial since $X = 0$ almost surely. The renormalization is actually too strong,
since the variance in this case should be of order $n \log n$, rather than $n^2$ [and
order $n$ in the special case when $bV = (1, \dots, 1)$]. The total path length for
binary tries has been treated by Jacquet and Regnier [30]. They showed
that the variance of $\Psi(T^n)$ is of order $\Theta(n)$ if $p = q$ and of order $\Theta(n \log n)$
if $p \ne q$, and that the path length is asymptotically normal. Schachinger [53]
showed that, for tries with a general branch factor, the variance of the total
path length is $\Theta(n \log^2 n)$. See also [34, 35].

(ii) In general, in the case of digital trees [when $C(V) = 0$], it is expected
that under the correct rescaling the limit distribution should be normal.
Neininger and Rüschendorf [46] gave general conditions under which limit
distributions are Gaussian. The case of the binary tries is one example where
this theorem can be applied as an alternative proof to the method in [30]. In
general, to apply the result in [46] one needs to have approximations for the
first two moments of the path length. This is the reason why we defer the
analysis of this case: a lot more work is required to estimate the variance to
the correct order.

(iii) It might seem at first that one should have $C(V) = 0$ when $\ln V$ is
lattice (trie case). However, one can easily construct examples with $C(V) \ne 0$
and $\ln V$ lattice: for instance, take $b = 5$ and $V$ a random permutation of
either (1/2, 1/8, 1/8, 1/8, 1/8) or (1/2, 1/4, 1/4, 0, 0), each with probability
1/2.

(iv) Note that, although it might come as a surprise since our main tool is
renewal theory, Theorem 2.1 does not require any condition on arithmetic
properties related to the vector $(V_1, \dots, V_b)$. In particular, it holds whether
$-\ln V$ is lattice or not. However, the behavior of the average path length
does depend on arithmetic properties of $\ln V$; see Theorem 3.1 later for
details.

(v) Note that the limit fixed point equation only depends on $V$, so in particular,
the limit distribution $X$ does not depend on the parameters $s$, $s_0$ or $s_1$.
However, the average $E[\Psi(T^n)]$ should clearly depend on these parameters,
although we do not prove it formally.

(vi) For the sake of simplicity, we cover only trees with bounded degree, 
which is usually the case for trees representing data structures. The path 
length of recursive trees, which do not have bounded degree, has been studied 
by [14, 38]. 

3. The contraction method for path length. The condition stated by
Neininger and Rüschendorf [45] to ensure weak convergence of the path
length concerns the asymptotics of the average path length. More precisely,
if one has, for some constant $\varsigma$,

(6) $E[\Psi(T^n)] = \mu^{-1} n \ln n + \varsigma n + o(n)$

and $P(C(V) \ne 0) > 0$, then Theorem 5.1 of [45] ensures that $X_n \to X$ in distribution.
The purpose of this section is to explain why these conditions are
sufficient to prove Theorem 2.1. In particular, we give the necessary background
about the contraction method, and we explain the general approach
that has been devised in [45]. This section is included only to put our result
in context, and no new result is proved with respect to the contraction
method.

Note first that (6) holds in the case of binary search trees (3). Recall
that $D_i$ is the depth of the $i$th item in the construction where items are
inserted one after another. It is not difficult to deduce from the results
on $D_i$ by Devroye [13] that

$E[\Psi(T^n)] = \mu^{-1} n \ln n + n q(n)$

with $q(n) = o(\ln n)$ (see Theorem 2.3 of [26] for a formal proof). So proving (6)
reduces to proving that $q(n) \to \varsigma$ as $n \to \infty$. Our contribution is to
prove that this is indeed the case as soon as the random variable $V$ is such
that $-\ln V$ is not lattice, that is, there is no $a \in \mathbb{R}$ such that $-\ln V \in a\mathbb{Z}$
almost surely. In the following, we let

$d := \sup\{a \ge 0 : P(\ln V \in a\mathbb{Z}) = 1\}$,

so that $d$ is the span of the lattice when $d > 0$ and $\ln V$ is nonlattice when
$d = 0$. More precisely, we prove:

Theorem 3.1. The expected value of the total path length $\Psi(T^n)$ exhibits
the following asymptotics, as $n \to \infty$:

(7) $E[\Psi(T^n)] = \mu^{-1} n \ln n + n \varpi(\ln n) + o(n)$,

where $\mu$ is the constant in (2) and $\varpi$ is a continuous periodic function of
period $d$. In particular, if $\ln V$ is not lattice, then $d = 0$ and $\varpi$ is constant.

If $\ln V$ is nonlattice, then Theorem 3.1 and Theorem 5.1 of [45] together
prove Theorem 2.1. If the random variable $\ln V$ is lattice with span $d$, then
Theorem 3.1 implies that $q(n) = \varpi(\ln n) + o(1)$ as $n \to \infty$, where $\varpi$ is
$d$-periodic. So it seems that Theorem 3.1 does not allow us to conclude along
the lines of the arguments by Neininger and Rüschendorf [45]. However, the techniques
in [45] only require convergence of the coefficients of a certain recursive
equation; this fact was used in [46] to deal with certain cases involving
oscillations.

We now move on to the approach developed by Neininger and Rüschendorf
[45, 46]. Let $\mathbf{n} = (n_1, \dots, n_b)$ denote the vector of cardinalities of the children
of the root. Then we have, for $n > s$,

$\Psi(T^n) \overset{d}{=} \sum_{i=1}^{b} \Psi_i(T^{n_i}) + n - s_0$,

where $\Psi_i(T^{n_i})$ are copies of $\Psi(T^{n_i})$ that are independent conditional on
$(n_1, \dots, n_b)$. Introducing the normalized total path length

(8) $X_n := \frac{\Psi(T^n) - E[\Psi(T^n)]}{n}$,

we can rewrite the distributional identity above as

$X_n \overset{d}{=} \sum_{i=1}^{b} \frac{n_i}{n} X^{(i)}_{n_i} + C_n(\mathbf{n})$,

where

$C_n(\mathbf{n}) = 1 + \frac{1}{n} \sum_{i=1}^{b} E[\Psi(T^{n_i}) \mid n_i] - \frac{s_0}{n} - \frac{E[\Psi(T^n)]}{n}$

and $X^{(i)}_{n_i}$, $i \in \{1, \dots, b\}$, are independent conditional on $(n_1, \dots, n_b)$. By definition,
the vector of cardinalities $\mathbf{n}$ is $\mathrm{Mult}(n - s_0 - b s_1, V_1, V_2, \dots, V_b) +
(s_1, s_1, \dots, s_1)$ so that

(9) $\left( \frac{n_1}{n}, \frac{n_2}{n}, \dots, \frac{n_b}{n} \right) \to V = (V_1, V_2, \dots, V_b)$,

almost surely as $n \to \infty$. This is where (6) comes into play: it ensures that
the cost $C_n(\mathbf{n})$ (the "toll function") in the recursive distributional equation
does converge (in distribution) as $n \to \infty$. Indeed

$C_n(\mathbf{n}) = 1 + \frac{1}{n} \sum_{i=1}^{b} E[\Psi(T^{n_i}) \mid n_i] - \frac{s_0}{n} - \frac{E[\Psi(T^n)]}{n}$

$= 1 + \frac{1}{\mu} \sum_{i=1}^{b} \frac{n_i}{n} \ln \frac{n_i}{n} + \left( \sum_{i=1}^{b} \frac{n_i}{n} \varpi(\ln n_i) - \varpi(\ln n) \right) + o(1)$.
Now, by (9) and the continuity of $\varpi$, it follows that

(10) $C_n(\mathbf{n}) \to C(V) = 1 + \frac{1}{\mu} \sum_{i=1}^{b} V_i \ln V_i$,

since $\varpi$ is $d$-periodic and $\ln V_i \in d\mathbb{Z}$ by assumption (if $d = 0$, $\varpi$ is constant
and the claim also holds). Note that, apart from (9), only asymptotics for
the first moments are required for (10) to hold. Together (9) and (10) suggest
that if $X_n$ converges in distribution to some limit $X$, then $X$ should satisfy
the following fixed point equation:

(11) $X \overset{d}{=} \sum_{k=1}^{b} V_k X^{(k)} + C(V)$ where $C(V) = 1 + \frac{1}{\mu} \sum_{i=1}^{b} V_i \ln V_i$,

and $X^{(k)}$ are independent and identically distributed copies of $X$.

The point of the contraction method is to make the previous arguments
rigorous, that is, to show that if the coefficients $C_n(\mathbf{n})$ do converge, then (11)
has a unique solution $X$ and that $X_n \to X$ in distribution; this is precisely
what was done in [45, 46]. This is done by proving that the recursive map
defined by (11) is a contraction in a suitable space of probability measures
[48, 50, 51]. We now outline the arguments to show the
extent of the results that follow from the mere convergence of the coefficients
$C_n(\mathbf{n})$. (We claim no novelty.)

Let $\mathcal{M}_2$ be the set of probability measures with a finite second moment.
For a random variable $X$, we write $\mathcal{D}(X)$ for its law. For $\phi \in \mathcal{M}_2$ and $X$ a random
variable with law $\mathcal{D}(X) = \phi$, define the $L^2$-norm by $\|X\|_2 = E[X^2]^{1/2}$.
We can then define a metric $d_2$ on $\mathcal{M}_2$ (the Mallows metric): for $\phi, \varphi \in \mathcal{M}_2$,
let

(12) $d_2(\phi, \varphi) := \inf \|X - Y\|_2$,

where the range of the infimum is the set of couples $(X, Y)$ with marginal
distributions $\mathcal{D}(X) = \phi$ and $\mathcal{D}(Y) = \varphi$. For simplicity we write $d_2(X, Y) =
d_2(\phi, \varphi)$ for random variables $X$ and $Y$, but note that this only depends on
the marginal distributions $\phi$ and $\varphi$. Convergence of $\phi_n$ to $\phi$ in $(\mathcal{M}_2, d_2)$ is
equivalent to weak convergence with convergence of the second moment [48]:

(13) $\phi_n \to \phi$ weakly and $\int x^2 \, d\phi_n(x) \to \int x^2 \, d\phi(x)$.

Let $\mathcal{M}_2^0$ be the subset of $\mathcal{M}_2$ containing distributions $\phi$ such that $\int x \, d\phi(x) =
0$. Define the operator $T : \mathcal{M}_2^0 \to \mathcal{M}_2^0$ as follows. For a distribution $\phi \in \mathcal{M}_2^0$, let $T(\phi)$
be the distribution of the random variable given by

$\sum_{1 \le k \le b} V_k Z^{(k)} + C(V)$,

where $Z^{(k)}$ are i.i.d. random variables with distribution $\phi$. Then, calculations
similar to that in the proof of Lemma 3.2 in [45] yield

$d_2(T(X), T(Y)) \le \left( \sum_{1 \le i \le b} E[V_i^2] \right)^{1/2} d_2(X, Y) = (b E[V^2])^{1/2} \, d_2(X, Y)$.

Since $b E[V^2] < 1$ the operator $T$ is a contraction in $(\mathcal{M}_2^0, d_2)$. Thus the
Banach fixed point theorem implies that $T$ has a unique fixed point. The
random variable $X$ has this fixed point as distribution. The same line of
thought actually implies that $d_2(X_n, X) \to 0$. A formal proof can be found
in [45]. As stated in (13), the convergence in $(\mathcal{M}_2^0, d_2)$ is strong enough to
imply convergence of second moments. In particular,

$\mathrm{Var}(\Psi(T^n)) \sim C n^2$,

where $C = \mathrm{Var}(X)$. Computing $E[X^2]$ using the fixed point equation, one
easily obtains the following expression for $C$:

(14) $C = \mathrm{Var}(X) = \frac{E[C(V)^2]}{1 - b E[V^2]}$.
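Taking second moments in the fixed point equation (11), and assuming as usual that the copies $X^{(k)}$ are independent of $V$ with $E[X] = 0$, gives $\mathrm{Var}(X) = E[C(V)^2] / (1 - b E[V^2])$. The sketch below evaluates this ratio numerically for the binary search tree of Example 1 and compares it with the classical value $7 - \frac{2}{3}\pi^2$ of the variance of the quicksort limit distribution. The midpoint quadrature and the names `toll` and `midpoint` are choices made for this illustration only.

```python
import math


def toll(u):
    """C(V) for the binary search tree: 1 + 2u ln u + 2(1-u) ln(1-u) (mu = 1/2)."""
    return 1.0 + 2.0 * u * math.log(u) + 2.0 * (1.0 - u) * math.log(1.0 - u)


def midpoint(f, n=200_000):
    """Midpoint rule on [0, 1]; the endpoints are never evaluated."""
    h = 1.0 / n
    return h * sum(f((k + 0.5) * h) for k in range(n))


second_moment_toll = midpoint(lambda u: toll(u) ** 2)   # E[C(V)^2]
contraction = 2.0 * (1.0 / 3.0)                         # b E[V^2] with b = 2
variance = second_moment_toll / (1.0 - contraction)     # Var(X)
quicksort_variance = 7.0 - 2.0 * math.pi ** 2 / 3.0     # classical value
```

The two numbers `variance` and `quicksort_variance` agree to within the quadrature error.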

This expression may also be obtained using estimates based on renewal
theory in the spirit of our proof of Theorem 3.1.

4. Precise asymptotics for the average path length.

4.1. Plan of the proof of Theorem 3.1. In the previous section, we have
explained why precise asymptotics for $E[\Psi(T^n)]$ imply convergence in distribution
of $\Psi(T^n)$ (suitably rescaled). We now move on to the proof of
Theorem 3.1.

Recall that $D_i$ denotes the depth of the $i$th inserted item. Write $i \in T_u$ if
the item $i$ is stored in the subtree rooted at $u$. Then rearranging the sum in
the definition of $\Psi(T^n)$, we see that

(15) $\Psi(T^n) = \sum_{i=1}^{n} D_i = \sum_{i=1}^{n} \sum_{u \ne \emptyset} \mathbf{1}\{i \in T_u\} = \sum_{u \ne \emptyset} n_u$.

Recall the following fact, which we used already in Section 3:

$\frac{1}{n} \mathrm{Mult}(n; V_1, \dots, V_b) \to (V_1, \dots, V_b)$,

almost surely, as $n \to \infty$. We actually have a similar behavior for any random
variable $n_v$, when $v$ is a fixed node (so in particular, its depth does not
depend on $n$). For a node $u$, the components $V_1, V_2, \dots, V_b$ of $V_u$ are naturally
associated to the children $u_1, u_2, \dots, u_b$ of $u$, and we can define $V_{u_i} = V_i$. For
the root node $\emptyset$, define $V_\emptyset = 1$. Then let

(16) $L_u = \prod_{v \preceq u} V_v$,

where $v \preceq u$ if $v$ is an ancestor of $u$. The random variables $\{L_u, u \in \mathcal{U}\}$ define
a recursive partition of $[0, 1]$, where $L_u$ is the length of the interval associated
with $u$. In general, for any fixed node $u$, we have

$\frac{n_u}{n} \to L_u$

almost surely as $n \to \infty$. So, as long as $n_v$ is large it should be well approximated
by $n L_v$. This suggests that the sum in (15) be decomposed into the
contributions of the top and of the fringe of the tree. We define the separation
in terms of a parameter $B$ measuring the size of the trees pending
in the fringe. The lengths $L_v$ are decreasing on any path from the root. So
let $R$ be the collection of nodes such that $r \in R$ if $r$ has $n L_r < B$ but for all
its strict ancestors $v$ we have $n L_v \ge B$. We write $T_r$, $r \in R$, for the subtrees
rooted at the nodes that belong to $R$.
Then

(17) $E[\Psi(T^n)] = E\left[ \sum_{v \ne \emptyset} n_v \mathbf{1}\{n L_v \ge B\} \right] + E\left[ \sum_{r \in R} \left( \Psi(T^{n_r}) + n_r \right) \right]$,

since given $n_r$, the total path length of $T_r$, $r \in R$, is distributed like $\Psi(T^{n_r})$.
[The term $n_r$ needs to be added since the cardinality of the root of a tree $T$
is not taken into account from our definition of $\Psi(T)$.] The following two
propositions gather the asymptotics for the two terms in (17) above that
will enable us to prove Theorem 3.1. In the following, we let

$d = \sup\{a \ge 0 : P(\ln V \in a\mathbb{Z}) = 1\}$.

Indeed, as we already mentioned (it will become clear soon), the arithmetic
properties of $\ln V$ influence the asymptotics.

Proposition 4.1. There exists a constant K such that, for all n large 
enough, and all B, we have 



E 



X] ^v'^{nL„>B} 









nlnl 











< K 



n 



where $\mu$ is the constant in (2) and $\phi_1$ is a continuous $d$-periodic function;
in particular, $\phi_1$ is constant when $d = 0$.

Proposition 4.2. There exists a constant K such that, for all n large 
enough, all e > small enough and B = , we have 



(18) 



E 



ft 

nipB ( In — 



< Ken 



for some $\psi_B$, a $d$-periodic function that depends on $B$. Furthermore, there
exists a constant $K'$ (independent of $B$) such that, for $\epsilon > 0$ small enough,

(19) $\sup_{q, q'} |\psi_B(q) - \psi_B(q')| \le K' \epsilon \ln(1/\epsilon)$.

The proofs of Propositions 4.1 and 4.2 both rely on renewal theory: first,
the sum $S_{n,B}$ is easily approximated by a function of sums of i.i.d. random
variables; second, the sizes $n_r$ in the second contribution can be estimated
using overshoot arguments. The necessary technical lemmas are introduced
in the following section. Then, we prove Propositions 4.1 and 4.2 in
Sections 4.3 and 4.4, respectively.

Before we proceed to the proofs of Propositions 4.1 and 4.2, we prove that 
they indeed imply Theorem 3.1. The nonlattice case should be rather clear, 
but the lattice case requires a little care. 

Proof of Theorem 3.1. We have been precise in the statements of
Propositions 4.1 and 4.2; we now take the liberty to use $O(\cdot)$ notation to
simplify the discussion. It is understood that the hidden constants do not
depend on $n$, $\epsilon$ or $B$.

(i) First assume that InV is nonlattice (d = 0). Let n,n be integers such 
that n<n. Fix e > 0, and choose B = e~'^^ . Then by the triangle inequality 
and Propositions 4.1 and 4.2, 

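$\left| \left( \frac{E[\Psi(T^n)]}{n} - \frac{1}{\mu} \ln n \right) - \left( \frac{E[\Psi(T^{\tilde n})]}{\tilde n} - \frac{1}{\mu} \ln \tilde n \right) \right| = O(\epsilon)$

as $n \to \infty$, where $\tilde n$ denotes the second of the two integers. Thus, the sequence
$(n^{-1} E[\Psi(T^n)] - \mu^{-1} \ln n, n \ge 1)$ is Cauchy,
hence the result.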

(ii) If $\ln V$ is lattice, the situation is different since we cannot directly
invoke similar arguments. In particular, we need to prove the existence
and continuity of the function $\varpi$. Fix $\beta \in [0, d)$ and consider $\Omega_\beta = \{n \ge 1 :
\exists k \in \mathbb{N}, |\ln n - kd - \beta| < n^{-1}\}$, the set of integers such that $\ln n \bmod d$ is
close to $\beta$. Then, by the triangle inequality and Propositions 4.1 and 4.2, we
have



E[^'(T")] 



n 



< 



In- 



n 



Inn 



n 



^ Inn 



In- 



n 



+ 



BJ "V B 
+ 0{e)+0{l/B) 
(/)i(lnn) — 01 (Inn) I + [(^^(Inn 



In- 



n 

'b 



In- 



n 

'b 



(^ij(lnn)l +0(e), 

if we choose e in such a way that -B = e"^*^ = /3 mod d. Now, 0i is contin- 
uous and (i-periodic so that there exists no (independent of /3) such that 
|(/>i(ln?i) — 01 (Inn) I < e when n,n > no inside Vtp. On the other hand, for 
n,n £ such that n,n> 2e~^, we have 

|99B(lnn) - v3B(lnn)| < K'e\n{l/e). 

Note that the bounds obtained are all uniform in /3. It follows that for 
every e > 0, there exists ni = max{no, e~^} such that for n, n G fi^ satisfying 
n, n > ni , we have 

E[^'(r")] \ /E[*(T")] 



n 



In 77, 



n 



Inn 



<0(e) + K'eln(l/e). 



Therefore, the subsequences $(n^{-1} E[\Psi(T^n)] - \mu^{-1} \ln n, n \in \Omega_\beta)$, $\beta \in [0, d)$, are
uniformly Cauchy (in $\beta$). It follows that there exists a fixed function $\varpi$
defined on $[0, d)$ such that, for every $\beta$ and $n \in \Omega_\beta$,

$E[\Psi(T^n)] = \frac{1}{\mu} n \ln n + n \varpi(\beta) + o(n)$.

Furthermore, the function $\varpi$ is continuous. This is easily seen using the
same arguments with $n \in \Omega_\beta$, $\tilde n \in \Omega_{\beta'}$ and $|\beta - \beta'| < \epsilon$. Once the definition
of $\varpi$ is extended by periodicity, the continuity ensures that we can write
the asymptotics for $E[\Psi(T^n)]$ in the form claimed in (7). This completes the
proof in the lattice case. □



4.2. The renewal structure of split trees. Renewal theory has already 
been used for studying random trees in [26, 28, 32, 42, 43]. The present 
paper is another example of its wide applicability. We start by quantifying 
the deviation between $n_v$ and $n L_v$ for fixed nodes $v \in \mathcal{U}$.






Lemma 4.1. For any node v, we have for all x large enough 
P(|n„ - nL^I > | nL^ > x) < x""^/^. 

Proof. First note that by the triangle inequality

P(|n^, — nL^I > (nLi,)'^^^ \ nL^ > s) 

< P(2|nt, - Bin(n,Lt,)| > {nL^f^'^ \ nL^ > x) 

+ P(2|Bin(n,L„) - nL^,] > (nL„)^/^ | nL„ > x). 

Suppose that $|v| = d$ and let $\mathcal{F}_d$ be the $\sigma$-field generated by the random
variables $V_u$ for $|u| < d$. Conditioning on $\mathcal{F}_d$, the recursive splits of the
cardinalities $n_u$ defined in (1) give in a stochastic sense the following bound
for $|n_v - \mathrm{Bin}(n, L_v)|$:



(20) 



Bin(n,L^,)| <st'^'B'm{s, Ly / Lu) 



u-<v 



Now, by (20), Chebyshev's inequality and Chernoff's bound for binomials 
(see, e.g., [11, 25, 33]) we obtain 

P(|?i„ — nLv\ > (nL^,)^/^ | nL„ > x) 



< 2x-2/3e 



+ E 



exp 



^Bin(s,L„/L„) 
-(nL,)4/3 



u~<v 



{nL, + (nL,)2/3/6) 
<25x-2/3^6-'= + e-^^'< 



nL„ > X 



fc>0 



for all X large enough. □ 



When the cardinalities $n_v$ are close to the product $n L_v$, renewal theory
allows us to get approximations suitable to prove Propositions 4.1 and 4.2.
It is convenient to introduce the additive form $S_v = -\ln L_v$. For $|v| = k$,

$S_v \overset{d}{=} S_k := \sum_{i=1}^{k} -\ln V_i$,



where $V_i$, $i \ge 1$, are i.i.d. copies of $V$. We define the exponential renewal
function

(21) $U(t) := \sum_{k=1}^{\infty} b^k P(S_k \le t)$,

which satisfies the following renewal equation with $\nu(t) = b P(-\ln V \le t)$:

(22) $U(t) = \nu(t) + (U * d\nu)(t)$ where $(U * d\nu)(t) = \int_0^t U(t - z) \, d\nu(z)$.

The measure $d\nu(t)$ is not a probability measure. To work with more convenient
renewal equations, involving probability measures, we introduce the
tilted measure $d\omega(t) = e^{-t} d\nu(t)$. It is easily seen that $d\omega(t)$ is a probability
measure, and defines a random variable $X$ by $P(X \in dt) = d\omega(t)$. In fact $\omega$
is the distribution function of $-\ln \Delta$, where $\Delta$ is the size-biased random
variable in (2): writing $I$ for a random variable that is $i$ with probability $V_i$
given $(V_1, \dots, V_b)$, we have

P(- In A < x) = EE[l{_i,v^^<^| \{Vi,..., Vh)] 

b 



E 



El 

Li=l 



{-\nVi<x}Vi 



-InVi 



;(x). 



= bE[l^_inV<x}e~ 
Then, from (2), $X$ obviously satisfies
\[ \mathbf{E}[X] = \mathbf{E}[-\ln\Delta] = \mu \qquad\text{and}\qquad \mathbf{E}[X^2] = \sigma^2 + \mu^2. \]
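The two moments of the tilted variable can be evaluated numerically for a concrete splitter. The sketch below is an illustration under stated assumptions: the components of the split vector share a marginal density on $(0,1)$ (passed as the hypothetical argument `density`), so that $\mathbf{E}[X] = b\,\mathbf{E}[-V\ln V]$ and $\mathbf{E}[X^2] = b\,\mathbf{E}[V\ln^2 V]$; a midpoint rule stands in for the integrals. The example call uses the binary search tree case, $b = 2$ with $V$ uniform.

```python
import math

def size_biased_moments(density, b, steps=100_000):
    """Midpoint-rule estimates of mu = b E[-V ln V] and
    sigma^2 = b E[V ln^2 V] - mu^2 for a marginal density on (0, 1)."""
    h = 1.0 / steps
    m1 = 0.0
    m2 = 0.0
    for i in range(steps):
        v = (i + 0.5) * h
        w = b * v * density(v) * h   # size-biased weight of the value v
        x = -math.log(v)
        m1 += w * x
        m2 += w * x * x
    return m1, m2 - m1 * m1

# Binary search tree (assumed representation): b = 2, V uniform on (0, 1).
mu, sigma2 = size_biased_moments(lambda v: 1.0, b=2)
```

In this uniform case the tilted law is exponential with rate 2, which gives $\mu = 1/2$ and $\sigma^2 = 1/4$ in closed form.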
The renewal equation (22) can then be rewritten as
\[ \bar U(t) = \bar\nu(t) + (\bar U * d\omega)(t), \tag{23} \]
where $\bar U(t) := e^{-t}U(t)$ and $\bar\nu(t) := e^{-t}\nu(t)$. The first-order asymptotics for $U(t)$ as $t \to \infty$ follows from the standard renewal theorem applied to $\bar U(t)$ (see also Theorem 7.1, Chapter V of [1] or Lemma 3.1 of [26] for a formal proof):
\[ U(t) = \bar U(t)e^{t} = \mu^{-1}e^{t} + o(e^{t}), \qquad t \to \infty. \tag{24} \]
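The first-order behavior in (24) can be observed on an explicit example. The sketch below assumes the binary search tree representation ($b = 2$, $V$ uniform on $(0,1)$, so that $S_k$ is Gamma$(k,1)$ and $\mu = 1/2$); these choices are assumptions made for illustration. It sums the series (21) directly, writing $\mathbf{P}(S_k \le t)$ as a Poisson tail to avoid cancellation, and compares the result with the closed form $2(e^{t} - 1)$ that a short generating-function computation gives in this case.

```python
import math

def gamma_cdf(k, t, terms=200):
    """P(S_k <= t) for S_k ~ Gamma(k, 1), computed as the Poisson tail
    sum_{j >= k} e^{-t} t^j / j! (truncated after `terms` summands)."""
    if t <= 0.0:
        return 0.0
    term = math.exp(k * math.log(t) - t - math.lgamma(k + 1))
    total = 0.0
    j = k
    for _ in range(terms):
        total += term
        j += 1
        term *= t / j
    return total

def renewal_U(t, b=2, kmax=150):
    """Truncated series U(t) = sum_{k=1}^{kmax} b^k P(S_k <= t), as in (21)."""
    return sum(b ** k * gamma_cdf(k, t) for k in range(1, kmax + 1))

if __name__ == "__main__":
    mu = 0.5
    for t in (1.0, 3.0, 8.0):
        print(t, renewal_U(t), 2.0 * (math.exp(t) - 1.0), math.exp(t) / mu)
```

The truncation levels `terms` and `kmax` are ad hoc; they are ample for the moderate values of $t$ used here but would need to grow with $t$.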

We will need some information about the second-order behavior of U{t). 
The following lemma will be sufficient for us. 



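Lemma 4.2. Let $d = \sup\{a \ge 0 : \mathbf{P}(\ln V \in a\mathbb{Z}) = 1\}$, so that $d = 0$ if $\ln V$ is nonlattice. Then, as $x \to \infty$,
\[ \int_0^x e^{-t}\bigl(U(t) - \mu^{-1}e^{t}\bigr)\,dt =
\begin{cases}
\dfrac{\sigma^2 - \mu^2}{2\mu^2} - \mu^{-1} + o(1), & \text{if } d = 0,\\[2mm]
\dfrac{\sigma^2 - \mu^2}{2\mu^2} - \mu^{-1} + \phi(x) + o(1), & \text{if } d > 0,
\end{cases} \tag{25} \]
where $\phi(x)$ is a bounded continuous periodic function with period $d$.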
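Proof. Let $X_k$ be i.i.d. copies of a random variable $X$ defined by $\mathbf{P}(X \in dt) = d\omega(t)$. Define the (standard) renewal function
\[ F(t) := \sum_{n \ge 0} \mathbf{P}\Bigl(\sum_{k=1}^{n} X_k \le t\Bigr). \tag{26} \]
Then the renewal theorem (Theorem V.2.4 of [1]) applied to (23) yields
\[ e^{-t}U(t) = \bar U(t) = \int_0^t \bar\nu(t - u)\,dF(u) = \int_0^t \bar\nu(u)\,dF(t - u). \tag{27} \]
[Note that $dF(t)$ includes a term $d\mathbf{P}(0 \le t) = \delta_0(t)$.] By Fubini's theorem we obtain
\[ \int_0^x e^{-t}\bigl(U(t) - \mu^{-1}e^{t}\bigr)\,dt = \int_0^x \bar\nu(u)\int_u^x dF(t - u)\,du - \frac{x}{\mu} = \int_0^x \bar\nu(u)F(x - u)\,du - \frac{x}{\mu}. \tag{28} \]
Recall that $\bar\nu(x) = \nu(x)e^{-x}$. Integration by parts gives
\[ \int_0^\infty \bar\nu(x)\,dx = b\bigl[-e^{-t}\mathbf{P}(-\ln V \le t)\bigr]_0^\infty + \int_0^\infty e^{-t}\,d\nu(t) = b\,\mathbf{E}[e^{\ln V}] = 1. \tag{29} \]
Rewriting (28) as a single integral, it follows that
\begin{align*}
\int_0^x e^{-t}\bigl(U(t) - \mu^{-1}e^{t}\bigr)\,dt
&= \int_0^\infty \bar\nu(u)\Bigl(F(x - u) - \frac{x}{\mu}\Bigr)\,du\\
&= -\frac{1}{\mu}\int_0^x \bar\nu(u)\,u\,du - \frac{1}{\mu}\int_x^\infty \bar\nu(u)\,x\,du \tag{30}\\
&\quad + \int_0^x \bar\nu(u)\Bigl(F(x - u) - \frac{x - u}{\mu}\Bigr)\,du.
\end{align*}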

We start with the first two terms in (30). Using again integration by parts and applying (29) yields
\begin{align*}
\int_0^\infty \bar\nu(u)\,u\,du &= \int_0^\infty e^{-u}\nu(u)\,u\,du\\
&= \int_0^\infty \bar\nu(u)\,du + \int_0^\infty u e^{-u}\,d\nu(u) \tag{31}\\
&= 1 + b\,\mathbf{E}[-V\ln V] = 1 + \mu,
\end{align*}
where the last equality follows from the definition of $\mu$ in (2). Finally, note that for all $x$,
\[ \int_x^\infty \bar\nu(u)\,x\,du \le \int_x^\infty \bar\nu(u)\,u\,du \to 0 \tag{32} \]
as $x \to \infty$ since $\int_0^\infty |\bar\nu(u)u|\,du < \infty$.



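So it only remains to estimate the third term in (30). This is related to the asymptotics for the renewal function $F(t)$, which are different depending on whether $\ln V$ is lattice or not. Write $\{x\}$ for the fractional part of a real number $x$, that is, $\{x\} = x - \lfloor x\rfloor$. Then, by Theorem 5.1 in [22] we have, as $t \to \infty$,
\[ F(t) = \frac{t}{\mu} + \frac{\sigma^2 + \mu^2}{2\mu^2} + o(1)
\qquad\text{and}\qquad
F(t) = \frac{t}{\mu} + \frac{\sigma^2 + \mu^2}{2\mu^2} + \frac{d}{\mu}\Bigl(\frac12 - \Bigl\{\frac{t}{d}\Bigr\}\Bigr) + o(1) \tag{33} \]
in the nonlattice and the $d$-lattice case, respectively. Furthermore, by Lorden's inequality ([37], Theorem 1),
\[ 0 \le F(t) - \frac{t}{\mu} \le \frac{\sigma^2 + \mu^2}{\mu^2}. \]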

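(i) We now first assume that $\ln V$ is nonlattice. The dominated convergence theorem applied to the last integral in (30), and (29), yield
\[ \lim_{x \to \infty} \int_0^\infty \bar\nu(u)\Bigl(F(x - u) - \frac{x - u}{\mu}\Bigr)\mathbf{1}_{\{u \le x\}}\,du = \int_0^\infty \bar\nu(u)\,\frac{\sigma^2 + \mu^2}{2\mu^2}\,du = \frac{\sigma^2 + \mu^2}{2\mu^2}. \]
Putting (33) together with (30), (31) and (32) we obtain, as $x \to \infty$,
\[ \int_0^x e^{-t}\bigl(U(t) - \mu^{-1}e^{t}\bigr)\,dt = -\frac{1 + \mu}{\mu} + \frac{\sigma^2 + \mu^2}{2\mu^2} + o(1) = \frac{\sigma^2 - \mu^2}{2\mu^2} - \mu^{-1} + o(1), \]
which proves the claim in (25) in the nonlattice case.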

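(ii) Similarly in the lattice case with span $d$, from the dominated convergence theorem we obtain
\begin{align*}
\int_0^x \bar\nu(u)\Bigl(F(x - u) - \frac{x - u}{\mu}\Bigr)\,du
&= \frac{\sigma^2 + \mu^2}{2\mu^2} + \frac{d}{\mu}\int_0^x \Bigl(\frac12 - \Bigl\{\frac{x - u}{d}\Bigr\}\Bigr)\bar\nu(u)\,du + o(1) \tag{34}\\
&= \frac{\sigma^2 + \mu^2}{2\mu^2} + \frac{d}{\mu}\int_0^\infty \Bigl(\frac12 - \Bigl\{\frac{x - u}{d}\Bigr\}\Bigr)\bar\nu(u)\,du + o(1)
\end{align*}
by (32). The function $\phi$ defined for $x \ge 0$ by
\[ \phi(x) = \frac{d}{\mu}\int_0^\infty \Bigl(\frac12 - \Bigl\{\frac{x - u}{d}\Bigr\}\Bigr)\bar\nu(u)\,du \]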
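is clearly $d$-periodic. Furthermore, the function $\phi(\cdot)$ is continuous. Indeed, for any $x, y$ such that $|x - y| < \varepsilon$ we have
\begin{align*}
\phi(y) &= \frac{d}{\mu}\int_0^\infty \Bigl(\frac12 - \Bigl\{\frac{y - u}{d}\Bigr\}\Bigr)\bar\nu(u)\,du\\
&= \frac{d}{\mu}\int_0^\infty \Bigl(\frac12 - \Bigl\{\frac{y - u}{d}\Bigr\}\Bigr)\mathbf{1}_{\{y - u \bmod d \,\in\, [\varepsilon, 1 - \varepsilon]\}}\bar\nu(u)\,du\\
&\quad + \frac{d}{\mu}\int_0^\infty \Bigl(\frac12 - \Bigl\{\frac{y - u}{d}\Bigr\}\Bigr)\mathbf{1}_{\{y - u \bmod d \,\notin\, [\varepsilon, 1 - \varepsilon]\}}\bar\nu(u)\,du.
\end{align*}
It follows that
\begin{align*}
|\phi(y) - \phi(x)| &\le \frac{2}{\mu}\varepsilon + 2\sup_{z \in \{x, y\}} \frac{d}{\mu}\int_0^\infty \Bigl|\frac12 - \Bigl\{\frac{z - u}{d}\Bigr\}\Bigr|\,\mathbf{1}_{\{z - u \bmod d \,\notin\, [\varepsilon, 1 - \varepsilon]\}}\bar\nu(u)\,du\\
&\le \frac{2}{\mu}\varepsilon + 2\sup_{z \in \{x, y\}} \frac{d}{\mu}\int_0^\infty \mathbf{1}_{\{z - u \bmod d \,\notin\, [\varepsilon, 1 - \varepsilon]\}}\bar\nu(u)\,du.
\end{align*}
Since $|\bar\nu(t)| = e^{-t}b\,\mathbf{P}(-\ln V \le t) \le b$, the dominated convergence theorem implies that $|\phi(x) - \phi(y)| \to 0$ as $\varepsilon \to 0$.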

Finally, putting (34) together with (30), (31) and (32) as before proves 
the lattice case in (25). □ 
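As an illustration of the nonlattice constant, the binary search tree case can be worked by hand. The computation below is a sketch under the assumption that this case corresponds to $b = 2$ with $V$ uniform on $(0,1)$, so that $-\ln V$ is exponential with mean one; it is not taken from the proof above.

```latex
% Binary search tree (assumed): b = 2, V uniform on (0,1), S_k ~ Gamma(k,1).
\begin{align*}
\mu &= b\,\mathbf{E}[-V\ln V] = 2\int_0^1 (-v\ln v)\,dv = \tfrac12,
\qquad
\sigma^2 + \mu^2 = b\,\mathbf{E}[V\ln^2 V] = 2\int_0^1 v\ln^2 v\,dv = \tfrac12,\\
U(t) &= \sum_{k\ge 1} 2^k\,\mathbf{P}(S_k \le t)
      = e^{-t}\sum_{j\ge 1}\frac{t^j}{j!}\,\bigl(2^{j+1}-2\bigr)
      = 2\,(e^{t}-1),\\
\int_0^x e^{-t}\bigl(U(t)-\mu^{-1}e^{t}\bigr)\,dt
    &= \int_0^x (-2e^{-t})\,dt = -2\,(1-e^{-x}) \longrightarrow -2
    \qquad (x\to\infty).
\end{align*}
```

Here $\sigma^2 = \mu^2 = 1/4$, so the limit $-2$ agrees with $(\sigma^2 - \mu^2)/(2\mu^2) - \mu^{-1} = 0 - 2$, the value obtained by combining (30), (31), (32) and (33) in the nonlattice case.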

4.3. Contribution of the top of the tree. In this section, we prove Proposition 4.1. For the top of the tree, the sizes $n_v$ are well approximated by $\mathrm{Bin}(n, L_v)$. This suggests that the main contribution of the top of the tree should be
\[ \mathbf{E}\Bigl[\sum_{v} n_v \mathbf{1}_{\{nL_v \ge B\}}\Bigr] = \mathbf{E}\Bigl[\sum_{v} \mathrm{Bin}(n, L_v)\mathbf{1}_{\{nL_v \ge B\}}\Bigr] + R_{n,B} \tag{35} \]
for a remainder $R_{n,B}$ that should be small. We first estimate the main contribution; we will then quantify $R_{n,B}$ using (20).



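Lemma 4.3. Let $d = \sup\{a : \mathbf{P}(\ln V \in a\mathbb{Z}) = 1\}$, so that $d = 0$ if $\ln V$ is nonlattice. Then, as $n/B \to \infty$,
\[ \mathbf{E}\Bigl[\sum_{v} \mathrm{Bin}(n, L_v)\mathbf{1}_{\{nL_v \ge B\}}\Bigr] =
\begin{cases}
\dfrac{1}{\mu}\,n\ln\Bigl(\dfrac{n}{B}\Bigr) + n\,\dfrac{\sigma^2 - \mu^2}{2\mu^2} + o(n), & \text{if } d = 0,\\[3mm]
\dfrac{1}{\mu}\,n\ln\Bigl(\dfrac{n}{B}\Bigr) + n\,\dfrac{\sigma^2 - \mu^2}{2\mu^2} + n\phi\Bigl(\ln\dfrac{n}{B}\Bigr) + o(n), & \text{if } d > 0,
\end{cases} \]
where $\mu$ and $\sigma$ are the constants in (2) and $\phi(\cdot)$ is a bounded continuous $d$-periodic function.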
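Proof. Let $V_i$, $i \ge 1$, be i.i.d. copies of $V$, and define $L_k = \prod_{i=1}^{k} V_i$ and $S_k = -\ln L_k$. Then, we have
\begin{align*}
\mathbf{E}\Bigl[\sum_{v} \mathrm{Bin}(n, L_v)\mathbf{1}_{\{nL_v \ge B\}}\Bigr]
&= n\,\mathbf{E}\Bigl[\sum_{k \ge 1} b^k L_k \mathbf{1}_{\{nL_k \ge B\}}\Bigr]\\
&= n\,\mathbf{E}\Bigl[\sum_{k \ge 1} b^k e^{-S_k}\mathbf{1}_{\{S_k \le \ln n - \ln B\}}\Bigr]\\
&= n\int_0^{\ln(n/B)} \sum_{k \ge 1} b^k e^{-t}\,d\mathbf{P}(S_k \le t)\\
&= n\int_0^{\ln(n/B)} e^{-t}\,dU(t),
\end{align*}
where $U(t)$ is the renewal function defined in (21). Using integration by parts we obtain, if $-\ln V$ is nonlattice,
\begin{align*}
\int_0^{\ln(n/B)} e^{-t}\,dU(t)
&= \bigl[e^{-t}U(t)\bigr]_0^{\ln(n/B)} + \int_0^{\ln(n/B)} e^{-t}U(t)\,dt\\
&= \frac{B}{n}\,U(\ln(n/B)) + \int_0^{\ln(n/B)} e^{-t}\bigl(U(t) - \mu^{-1}e^{t}\bigr)\,dt + \mu^{-1}\ln(n/B)\\
&= \mu^{-1} + o(1) + \frac{\sigma^2 - \mu^2}{2\mu^2} - \mu^{-1} + \mu^{-1}\ln(n/B) + o(1)
\end{align*}
by Lemma 4.2 and (24). Similarly if $-\ln V$ is lattice with span $d$, Lemma 4.2 and (24) yield
\[ \int_0^{\ln(n/B)} e^{-t}\,dU(t) = \mu^{-1} + o(1) + \frac{\sigma^2 - \mu^2}{2\mu^2} - \mu^{-1} + \mu^{-1}\ln(n/B) + \phi(\ln(n/B)) + o(1), \]
where $\phi(t)$ is a continuous periodic function with period $d$. □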
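The estimate can be checked exactly in one explicit case. The following worked computation is a sketch that assumes the binary search tree representation ($b = 2$, $V$ uniform on $(0,1)$, hence $S_k \sim$ Gamma$(k,1)$, $\mu = 1/2$ and $\sigma^2 = 1/4$); the symbol $x$ stands for $\ln(n/B)$.

```latex
% Binary search tree (assumed): L_k = e^{-S_k} with S_k ~ Gamma(k,1).
\begin{align*}
\mathbf{E}\bigl[L_k \mathbf{1}_{\{S_k \le x\}}\bigr]
  &= \int_0^x e^{-t}\,\frac{t^{k-1}e^{-t}}{(k-1)!}\,dt
   = 2^{-k}\,\mathbf{P}\bigl(\mathrm{Poisson}(2x) \ge k\bigr),\\
n\sum_{k\ge 1} 2^k\,\mathbf{E}\bigl[L_k \mathbf{1}_{\{S_k \le x\}}\bigr]
  &= n\sum_{k\ge 1}\mathbf{P}\bigl(\mathrm{Poisson}(2x) \ge k\bigr)
   = n\,\mathbf{E}\bigl[\mathrm{Poisson}(2x)\bigr]
   = 2n\ln(n/B).
\end{align*}
```

In this case the sum equals $\mu^{-1} n\ln(n/B)$ exactly, with no term of order $n$, which is consistent with a linear coefficient $(\sigma^2 - \mu^2)/(2\mu^2)$ that vanishes when $\sigma^2 = \mu^2$.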

We now deal with the remainder $R_{n,B}$ introduced in (35). The difference between $n_v$ and the binomial is bounded in (20) and we have
\[ |R_{n,B}| \le \mathbf{E}\Bigl[\sum_{v} \mathbf{1}_{\{nL_v \ge B\}} \sum_{u \prec v} \mathrm{Bin}(s, L_v/L_u)\Bigr]. \]
Lemma 4.4. The following estimate holds: there exist a constant and $n_0$ such that, for every fixed $B$ and $n \ge n_0$, we have



E 



u~<v 



o 



Proof. In the following, \v\ = d, \u\ = k < d, and we write i = d — k. 
Then L„ is distributed as = ■ Li, where the two factors are products 
of k and i copies of V, respectively; all of them are independent. Swapping 
the sums over u and v, we obtain 



E 



(36) 



E 



,>B} 



< sB 



k>0 £>0 



'-k>o e>o 

First conditioning on 5^ in each term of the sum above, and recalling the 
renewal function U{t) defined in (21), we see that 

r.ln(n./B)-Sfe 



E 



£>0 



Sk 


-L 







e~* dU{t) + hi 



{e^k<n/B}- 



However, there exists a constant $C$ such that, for any real number $x$,
\[ \int_0^x e^{-t}\,dU(t) \le C x\,\mathbf{1}_{\{x \ge 0\}}. \]
Going back to (36) and choosing $x = \ln(n/B) - S_k$, it follows that
J2 ^{nL^>B}^'B^'^{s,Ly/Lu) 



E 



<CE 



C 







X&^(ln(n/S) -Sk + b)l{s,<\n{n/B)} 
-k>0 

\n{n/B) 

(ln(n/B) - t + 6) dU{t) 

\n{n/B) 



C[{Hn/B)-t)U{t)], 

rHn/B) 

+ C' I U{t)dt, 
where the last line follows by integration by parts and we wrote $C' = C(1 + b)$. The claim then follows from (24). □

4.4. Contribution of the fringe: Proof of Proposition 4.2. Finally, we prove Proposition 4.2, which deals with the contribution of the fringe of the tree. Recall that from (17), we have to estimate



(37) 



E 



E 



where, for convenience, we introduced $\Phi(T^k) := \Psi(T^k) + k$. The proofs here get quite technical at times, and the reader should bear in mind that we will essentially express the expected value in (37) as a mixture of the expected values $\mathbf{E}[\Phi(T^k)]$, for $k$ lower than $B$.

For a node r, define the conditional expectation P,. = E[^(T""') | n,.]. First, 
the first asymptotic order of the expected total path length implies that 



(38) 



0{nr Inrir 



The next lemma is used to get an error bound for the sum of the expected total path lengths of the subtrees $T_r$, $r \in R$, with cardinalities that differ from $nL_r$ by at least $B^{2/3}$ items, so that we only have to bother about the subtrees $T_r$, $r \in R$, with cardinalities that are close to $nL_r$.

Lemma 4.5. The following error bound holds:
\[ \mathbf{E}\Bigl[\sum_{r \in R} n_r \ln n_r\,\mathbf{1}_{\{|n_r - nL_r| > B^{2/3}\}}\Bigr] = O\Bigl(\frac{n\ln B}{B^{1/4}}\Bigr). \]



We omit the proof; it follows by a simple modification of the proof of 
Lemma 4.3 of [26]. By Lemma 4.5, we have 



E 



E 

reR. 



:E 



^ 1 nf'^^^^\ 

Z^r^l{|n,-n,L,|<B2/3} + ^ ( "^TTT ) " 
r£R J \ / 



Define $R' \subseteq R$ to be the set of "good" nodes in $R$:
\[ R' := \{r \in R : |n_r - nL_r| \le B^{2/3}\} \tag{39} \]
and let $R'' \subseteq R'$ be the subset of nodes $r \in R'$ that also satisfy $nL_r \ge \varepsilon^2 B$.

We will now explain that it is enough to consider the nodes $r \in R''$. The approximation of $U(t)$ in (24) implies that the expected number of nodes $v$ such that $nL_v \ge B$ is $O(n/B)$; thus, since each node has at most $b$ children,
\[ \mathbf{E}[|R|] = O(n/B) \tag{40} \]
as well. Hence, it follows from (39) that the expected number of nodes in the $T_r$, $r \in R'$, with $nL_r < \varepsilon^2 B$ is bounded by $O(\varepsilon^2 n)$. Using this fact yields



(41) 



E 



E 



E 



E 

tS-R" 



nlnB 



Because of the concentration of $n_r$ around $nL_r$, the cardinalities $n_r$ of the nodes $r \in R$ are naturally related to the behavior of the "overshoot" of the renewal process $(-\ln L_k, k \ge 0)$, when it crosses the line $\ln(n/B)$. Estimating the empirical distribution of the cardinalities of the nodes $r \in R$ will allow us to approximate the right-hand side above. So we further subdivide the nodes $r \in R$ into smaller classes according to the values of $nL_r$, $r \in R$.

Let $Z = \{B, B - \gamma B, B - 2\gamma B, \dots, \varepsilon^2 B\}$, where we let $\gamma = \varepsilon^2$. We write $R_z \subseteq R$, $z \in Z$, for the set of nodes $r \in R$ such that $nL_r \in (z - \gamma B, z]$. Then (41) can be rewritten as



(42) E 



■re-R 



E 



E E 

zeZreR'nR:, 



Tr 



+ 0{e'^nlnB) + 



nlnB 



Even in a fixed class Rz, not all the nodes have the same cardinality n^. So, 
in order to estimate the expected value in (42) we need the following lemma 
that quantifies the discrepancy of E[^'(r")] under small variations of n. 

Lemma 4.6. There exists a constant $C$ such that, for any natural numbers $n$ and $K$, we have
\[ \bigl|\mathbf{E}[\Phi(T^{n+K})] - \mathbf{E}[\Phi(T^{n})]\bigr| \le CK\ln(n + K). \]

Proof. From the iterative construction, we clearly have $\mathbf{E}[\Psi(T^{n+K})] \ge \mathbf{E}[\Psi(T^{n})]$; so it suffices to bound the increase in path length when adding $K$ extra items to the tree $T^n$. Thinking again of the iterative construction, every ball trickles down until it finds a leaf. Then, either it sits there if there is room left, or it triggers a growth of the tree. It is important to notice that only these $s + 1$ balls may move. Furthermore, the increase in depth of any of the $s + 1$ items (the last one, plus the $s$ that were already sitting at the leaf) is at most the height of the final tree $H_{n+K}$. Hence, upon adding $K$ items, the path length increases by at most $K(s + 1)H_{n+K} \le CK\ln(n + K)$, by the results of [13] on the height of split trees. □
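The mechanism behind this smoothness bound is easy to observe in the simplest split tree. The sketch below is an illustration only: it assumes a plain binary search tree, where an inserted key never displaces the keys already stored, so each of the $K$ extra keys contributes at most the height of the final tree to the total path length. The helper names are hypothetical.

```python
import random

def bst_depths(keys):
    """Insert distinct keys one by one into a binary search tree and
    return a dict mapping each key to its depth (root has depth 0)."""
    left, right, depth = {}, {}, {}
    root = None
    for key in keys:
        if root is None:
            root = key
            depth[key] = 0
            continue
        node, d = root, 0
        while True:
            child = left if key < node else right
            d += 1
            if node in child:
                node = child[node]      # keep walking down
            else:
                child[node] = key       # attach the new key as a leaf
                depth[key] = d
                break
    return depth

def path_length_increase(n, extra, seed=0):
    """Return (increase in total path length, extra * final height)
    when `extra` keys are added to a random BST built from n keys."""
    rng = random.Random(seed)
    keys = rng.sample(range(10 ** 9), n + extra)   # distinct keys
    before = sum(bst_depths(keys[:n]).values())
    final = bst_depths(keys)
    after = sum(final.values())
    height = max(final.values())
    return after - before, extra * height

if __name__ == "__main__":
    print(path_length_increase(2000, 100, seed=1))
```

For general split trees up to $s + 1$ balls may move at each insertion, which is where the factor $s + 1$ in the proof comes from; the binary search tree is the degenerate case in which only the new ball moves.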

Write = E[$(rW)]. Then Lemma 4.6 ensures that, for any node r G 
R' Ci Rz, we have Tr = fz + 0{'yBliiB). By using (39) and Lemma 4.6, 
from (42) we obtain 



E 



^^(^n.) =^B[\R'nRz\]ifz + Oi^BlnB)) 
+ 0{e^nlnB) +

(43) 



51/4 



^B[\R'nR,\]f, + 0{jnlnB) 



+ 0{e^nlnB) + 



n\nB 



since E[\R\]=0{n/B) by (40). 

So the contribution of the fringe is essentially a mixture of the $f_z$, $z \in Z$. To complete the proof of Proposition 4.2, it suffices to estimate the mixing measure $\mathbf{E}[|R' \cap R_z|]$, $z \in Z$. We first focus on the asymptotics for $\mathbf{E}[|R_z|]$, $z \in Z$. The following result is obtained by an application of the key renewal theorem.

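Lemma 4.7. Fix $\varepsilon > 0$ and let $S := \{1, 1 - \gamma, 1 - 2\gamma, \dots, \varepsilon^2\}$ where $\gamma = \varepsilon^2$. Let $d = \sup\{a : \mathbf{P}(\ln V \in a\mathbb{Z}) = 1\}$. If $d > 0$, we suppose that $\ln B \in d\mathbb{N}$. Then for any $\alpha \in S$ we have, as $n \to \infty$,
\[ \frac{\mathbf{E}[|R_{\alpha B}|]}{n/B} =
\begin{cases}
c_\alpha + o(1), & \text{if } \ln V \text{ is nonlattice } (d = 0),\\
\psi_\alpha(\ln(n/B)) + o(1), & \text{if } \ln V \text{ is } d\text{-lattice } (d > 0),
\end{cases} \tag{44} \]
for a constant $c_\alpha$ (only depending on $\alpha$ and $\gamma$) and $\psi_\alpha(\cdot)$ is the $d$-periodic function given in (48) below.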

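Proof. Let $V_j$, $j \ge 1$, be i.i.d. copies of $V$. For an integer $k$, write $S_k = -\sum_{j=1}^{k} \ln V_j$. Then, by definition, for $\alpha \in S$, we have
\begin{align*}
\mathbf{E}[|R_{\alpha B}|] &= \sum_{u \in U} \mathbf{P}(u \in R_{\alpha B})\\
&= \sum_{k=0}^{\infty} b^{k+1}\Bigl(\mathbf{P}\bigl(S_k - \ln V_{k+1} \ge \ln(n/B) - \ln\alpha \text{ and } S_k \le \ln(n/B)\bigr)\\
&\qquad\qquad - \mathbf{P}\bigl(S_k - \ln V_{k+1} \ge \ln(n/B) - \ln(\alpha - \gamma) \text{ and } S_k \le \ln(n/B)\bigr)\Bigr)\\
&= \int_0^{\ln(n/B)} b\,\mathbf{P}\bigl(\ln(n/B) - t - \ln\alpha \le -\ln V_{k+1} < \ln(n/B) - t - \ln(\alpha - \gamma)\bigr)\,dU_0(t),
\end{align*}
where $U_0(t) = U(t) + 1$ is a simple modification of the renewal function $U(t) = \sum_{k \ge 1} b^k \mathbf{P}(S_k \le t)$ defined in (21). Thus, seeing $\mathbf{E}[|R_{\alpha B}|]$ as a function of $\ln(n/B)$ and writing
\[ H(q) := \int_0^q b\,\mathbf{P}\bigl(q - t - \ln\alpha \le -\ln V_{k+1} < q - t - \ln(\alpha - \gamma)\bigr)\,dU_0(t), \tag{45} \]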
we have $\mathbf{E}[|R_{\alpha B}|] = H(\ln(n/B))$. So we are after the asymptotics for $H(q)$, as $q \to \infty$. It is convenient to use a change of measure to relate $H(q)$ to a renewal function associated to a probability measure. We have
\begin{align*}
\bar H(q) &:= e^{-q}H(q)\\
&= \int_0^q e^{-(q - t)}G(q - t)\,e^{-t}\,dU_0(t) \tag{46}\\
&= \int_0^q b\,e^{-(q - t)}\,\mathbf{P}\bigl(q - t - \ln\alpha \le -\ln V_{k+1} < q - t - \ln(\alpha - \gamma)\bigr)\,dF(t),
\end{align*}
where $F(t)$ is the standard renewal function already introduced in (26). The asymptotics for the integral above are then easily obtained by using the key renewal theorem. In particular, they depend on whether $\ln V$ is lattice or not.

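(i) If $\ln V$ is nonlattice, by the key renewal theorem ([22], Theorem II.4.3), we obtain
\[ \lim_{q \to \infty} \bar H(q) = c_\alpha := \frac{b}{\mu}\int_0^\infty e^{-t}\,\mathbf{P}\bigl(t - \ln\alpha \le -\ln V < t - \ln(\alpha - \gamma)\bigr)\,dt. \tag{47} \]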

Note that the constant Cq only depends on a (and 7) and that Ylaes ^ 
Thus, since $H(x) = e^{x}\bar H(x)$, it follows immediately that $\mathbf{E}[|R_{\alpha B}|] = c_\alpha \frac{n}{B} + o\bigl(\frac{n}{B}\bigr)$, which proves the nonlattice case in (44).

(ii) Similarly, if $\ln V$ is lattice with span $d$, the key renewal theorem (see [22], Theorem II.4.3, or [32], Theorem A.7) implies that
\[ \bar H(q) \sim \psi_\alpha(q) = \frac{bd}{\mu}\sum_{k : kd \le q} e^{kd - q}\,\mathbf{P}\bigl(q - kd - \ln\alpha \le -\ln V < q - kd - \ln(\alpha - \gamma)\bigr) \tag{48} \]
as $q \to \infty$. Note that $\psi_\alpha$ is a (positive) $d$-periodic function. Observe also that for fixed $\alpha$, the function $\psi_\alpha(\cdot)$ is not continuous since $\ln V \in d\mathbb{Z}$ almost surely. Since $H(x) = e^{x}\bar H(x)$, it follows from (48) that $\mathbf{E}[|R_{\alpha B}|] \sim \frac{n}{B}\psi_\alpha(\ln(n/B))$. This proves the lattice case in (44), and completes the proof. □

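With Lemma 4.7 in hand, we can now deduce the asymptotics for $\mathbf{E}[|R' \cap R_z|]$, $z \in Z$, and use them in (43) to complete the proof of Proposition 4.2. Recall that $R' = \{r \in R : |n_r - nL_r| \le B^{2/3}\}$. Clearly, $\mathbf{E}[|R' \cap R_{\alpha B}|] \le \mathbf{E}[|R_{\alpha B}|]$. Furthermore,
\begin{align*}
\mathbf{E}[|R' \cap R_{\alpha B}|] &= \sum_{r \in R} \mathbf{P}\bigl(|n_r - nL_r| \le B^{2/3},\ (\alpha - \gamma)B < nL_r \le \alpha B\bigr)\\
&= \sum_{r \in R} \mathbf{P}\bigl((\alpha - \gamma)B < nL_r \le \alpha B\bigr)\\
&\qquad\times \mathbf{P}\bigl(|n_r - nL_r| \le B^{2/3} \mid (\alpha - \gamma)B < nL_r \le \alpha B\bigr)\\
&\ge \mathbf{E}[|R_{\alpha B}|]\bigl(1 - O(B^{-1/4})\bigr)
\end{align*}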

by Lemma 4.1. We now choose B = e~^'^ so that = e^. 

(i) If \nV is nonlattice, it follows from Lemma 4.7 that for each choice 
of 7 there is a constant such that for all a E 5" and some constant Cq 
(that of Lemma 4.7) we have 



E[\R'nRaB\ 



<-f^ + 0{B 



7' + 0(e') = 0(^'), 



whenever n/B > K^. So for all n large enough, since fx = O(xlnx), we have 



E 



^ vi;(r"'-) = ^ c^^fo^B + ^ E 0(/aB£') + 0(n7lni?) + 0{e^n\nB) 



-reR 



n 



aG5 

en). 



This proves Proposition 4.2 when Inl/ is nonlattice. 

(ii) Similarly, if InV is d-lattice, for any choice of 7, there is a such 
that for any a G S" and some continuous d-periodic function ipa{t) [that of 
Lemma 4.7 defined in (48)], we have 



E[|/?'ni?„B| 



,„ V'a(lnn) < 72 + 0(5- 

n/B 

whenever n/B > K^. It follows that 



-1/4^, 



E 



^*(r-^) =^Va(lnn)|/«s + |EO(/-^^' 



(49) 



: n 



+ 0{n-/lnB) + 0{e^nlnB) 
E '^V'a(lnra) + 0(en). 



aG5 



This proves the claim in the lattice case with (pB defined by 
(50) 



It now only remains to prove that, although the functions ipai'), ot G 5", 
are not continuous, the d-periodic function ipB satisfies the bound in (19). 
Lemma 4.8. The function ipB defined in (50) satisfies
sup \ipB{q) - 'fB{q')\ < Keln{l/e). 

\q—q'\<e'^ 

Proof. From the expresssion for -i/'o in (48), we have 
^B{q) = -T^ V e'"'~'^P{q-kd + lnV e[Ha-j),lna)) 

^ aeS k : kd<q 



= ^ el'd-i {q - kd + \nV e[ln{a--f), In a)). 

^ k:kd<q aeS 

Note that, since 7 = 6'^ and a > e^, 

|ln(a — 7) — lna| ~ — 
a 

as e —7- 0. As a consequence, for all e > small enough, the intervals involved 
in the definition of ipa satisfy, uniformly in a G S*, 

— < |ln(a — 7) — lna| < e. 

In particular, since InV^ G almost surely, there is at most one atom in the 
interval as soon as e < d. It follows that, if we choose 6 = we have for 
any q, q' such that \q — q'\ < 6 

P{q' -kd+lnV £ [ln{a--f),lna))=P{q-kd + lnV£ [ln(a' - 7),lna')) 

for some a' in {0 + 7,0,0 — 7}. We adopt the following point of view: for 
fixed k and q, S induces a partition into the intervals [q — kd — ln(a),g — 
kd — ln(a — 7)), a £ S. Each interval contains at most one atom of —InV. 
Changing q into q' as above modifies the partition, but each atom may only 
move to an adjacent interval. All atoms of InV appear in both sums, except 
if one is so far that it escapes the range of the partition (recall that a > e^). 
So following the atoms of — In 1/ rather than the intervals in one or the other 
partition yields 

^\v>B{q) - ^B{q')\ 



< max y e'^'^-^'+'^y 1 



max 



faB fa' 



B 



+ max V e'^'^-^^ 



B B 

X P{x -kd + \Q.V £ [ln(Q; - 7),lna)) 

kd-xfe'^B 
where the second term accounts for the escape of one atom. It follows that
< max V e'"^-^+'^ VK7lnS-P(x- ifcd + lny G [ln(a-7),lna)) 

^^'^'k:kd<x aeS 

for some constant by Lemma 4.6 and the asymptotics for fz- Swapping 
the sums once again to recover the functions ipa{')i it follows that 

Wsiq) -^B{q')\ < — K-fe^ In B -supy^ipaix). 

However, since every summand is nonnegative, we have for any x 

0<Y,Mx) = — Yj e'''^"'' (x - kd + InV e Ma--/) Ma)) 

aeS ^ k : kd<x aeS 



(51) 



~ a ^ ~ u ' 

^ k:kd<x ^ 



The desired bound follows: for any q, q' such that \q — q'\ < we have 

\'PB{q)-^B{q')\<K"eHl/e) 
for some constant K" independent q,q' or e. □ 

5. Extensions and concluding remarks. 

5.1. An alternative notion of path length. The notion of path length we have considered so far is the sum of the depths of the items in the tree. This is most natural when one thinks about performance measures for algorithms or sorted data structures. However, for some applications, it is important to introduce a related notion of path length $\Upsilon(T)$, that is, the sum of the depths of the nodes:
\[ \Upsilon(T) := \sum_{u} |u|\,\mathbf{1}_{\{u \in T\}} = \sum_{u \ne \text{root}} N_u, \]
where $N_u$ denotes the number of nodes in the subtree rooted at $u$. This notion of path length appears, for instance, in the analysis of cutting-down processes. Suppose that you are given a rooted tree $T$. Initially, the process starts with $T$. At each time step, a uniformly random edge is cut, the portion of the tree that is disconnected from the root is lost, and the process continues with the portion containing the root. How many random cuts does it take to isolate the root? The question originates in the seminal work of Meir and Moon [40, 41]. Recently, the subject has regained interest, and new results have been proved about the weak limit of the number of cuts when the initial tree is randomly picked according to various distributions. See [16, 27-29, 31] for more references and details about the precise models and results.
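The cutting-down process just described is short to simulate. The following is a minimal sketch, not an implementation from the cited works: the tree is assumed to be given as a hypothetical `parent` array (node 0 is the root, and `parent[v]` is the parent of node `v`), and each remaining edge is identified with its lower endpoint.

```python
import random

def cuts_to_isolate_root(parent, rng):
    """Simulate the cutting-down process on a rooted tree and return
    the number of uniformly random edge cuts needed to isolate the root.

    parent[v] is the parent of node v for v >= 1; parent[0] is ignored.
    """
    n = len(parent)
    children = [[] for _ in range(n)]
    for v in range(1, n):
        children[parent[v]].append(v)

    alive = set(range(1, n))   # node v stands for the edge (parent[v], v)
    cuts = 0
    while alive:
        v = rng.choice(sorted(alive))   # uniformly random remaining edge
        cuts += 1
        stack = [v]                     # discard the whole subtree below the cut
        while stack:
            w = stack.pop()
            if w in alive:
                alive.discard(w)
                stack.extend(children[w])
    return cuts

if __name__ == "__main__":
    rng = random.Random(0)
    path = [-1] + list(range(0, 9))     # a path with 9 edges
    print([cuts_to_isolate_root(path, rng) for _ in range(5)])
```

On a star every cut removes exactly one leaf, so the number of cuts equals the number of edges; on a path it ranges between one and the number of edges. Sorting the set at each step keeps runs reproducible for a given seed at the price of quadratic running time, which is acceptable for small trees.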

For instance, Holmgren [28] has proved that, when the initial tree is a split tree satisfying two general conditions (one on $\mathbf{E}[\Upsilon(T^n)]$ and one on the number of nodes), the normalized number of cuttings converges in distribution to a weakly 1-stable law (Theorem 1.1 there). Our Theorem 3.1 allows us to prove that one of the conditions assumed in [28] actually implies the other. More precisely, the conditions assumed in [28] are that $\Upsilon(T^n)$ (the path length of nodes) satisfies

E[T(r")] = -nlnn + + o{n), 

and that the number of nodes $N = |T^n|$ verifies, for some constants $\alpha > 0$ and $\varepsilon > 0$,
\[ \mathbf{E}[N] = \alpha n + f(n) \qquad\text{where } f(n) = O\Bigl(\frac{n}{\ln^{1+\varepsilon} n}\Bigr). \tag{52} \]
We deduce from Theorem 3.1:

Corollary 5.1. Suppose that $\ln V$ is nonlattice, and assume that (52) holds true; then, as $n \to \infty$,

E[T(r")] = -nlnn + + o(n). 

Remarks. The assumption in (52) is just slightly stronger than the 
estimate proved by Holmgren [26], that is, that for split tree with non- 
lattice InV, we have f{n) = o{n). Moreover, the assumption in (52) does 
make sense, since it is known to hold, for instance, for m-ary search trees 
[2, 10, 36, 39]: for such random trees, f{n) is o(-y/n) when m < 26 and is 
0{v}~^) when m > 27. On the other hand, it is also known that the condition 
in (52) does not always hold. For instance, Flajolet et al. [20] proved that, in the case of binary tries generated by a memoryless source with probabilities $p_1, p_2$ such that $(\log p_1)/(\log p_2)$ is a Liouville number, the error term $f(n)$ can come arbitrarily close to $O(n)$ [but of course, stays $o(n)$]. See [20], page 249, and the monograph by Baker [3] for more information about Liouville numbers.

Sketch of proof. Define $q(n)$ and $r(n)$ by
\[ \mathbf{E}[\Psi(T^n)] = \frac{1}{\mu}\,n\ln n + n\,q(n) \qquad\text{and}\qquad \mathbf{E}[\Upsilon(T^n)] = \frac{\alpha}{\mu}\,n\ln n + n\,r(n). \]
Let $A_n := \alpha n q(n) - n r(n)$, and note that
\[ A_n = \alpha\,\mathbf{E}[\Psi(T^n)] - \mathbf{E}[\Upsilon(T^n)]. \tag{53} \]
Since, by Theorem 3.1, $q(n)$ converges as $n \to \infty$, it suffices to prove that $A_n/n$ also converges to some constant. From (53) and the assumption in (52) we obtain



An = aE 



(54) 



E 



Y.0 



E 



( any + O ( 



n,; 



log 



l+e 



n, 



[The constants hidden in the O(-) above are the same for every term.] 

Consider the subtrees $T_r$, $r \in R$, introduced in the course of the proof of Theorem 3.1. Recall that a node $r$ is in $R$ if it is the first on its path from the root such that $nL_r < B$, for some parameter $B$. In the following,
we take B = 6~^, for 6 > 0. We now show that the main contribution to A„ 
is accounted for by the nodes in the subtrees T^, r £ R; in other words 
An = EErefl ^"r] + oin) , where 

= aE[^{Tr)\nr] - E[T(r,)|n,]. 
To see this, observe that we deduce from (54) and (52) that 



E 



E 



E 



■v<^Tr,r£B., 



rir. 



log 



l+e 



n, 



E E 

fc>0 v(^Tr,r&R, 
2'=<n„<2'=+i 



o 



rir. 



log^+^n„ 



+ 



n 



logn 



We split the sum in $k$ above at some constant $K$ to be chosen later. By Lemma 4.1 and since the expected number of nodes $v$ with $nL_v \ge B$ is $O(n/B)$, we obtain

0<fc<A' 



E 



tS-R 



k>K 



o 



n 



2k J.l+e 



n 
B 



+ o(n) 



= OinK^") + 0{nK2^/B) + o{n). 

We choose $K = \lfloor a\ln(1/\delta)\rfloor$, for some small constant $a > 0$. Since $\delta > 0$ was arbitrary, the claim follows.

Now since $A_{n_r} = O(n_r \ln n_r)$, the proof of Proposition 4.2 (in the nonlattice case) may be extended to show that $\mathbf{E}\bigl[\sum_{r \in R} A_{n_r}\bigr]$ equals a constant multiple of $n$, up to an error $o(n)$. The details are omitted. □

5.2. Beyond split trees and multinomial partitions. To conclude, we in- 
dicate the lines of the arguments to extend the applicability of our main 
theorem to a greater family of random trees. The model of split trees [13] 
supposes that the distribution of the subtree cardinalities $n_1, n_2, \dots, n_b$ of a node of cardinality $n$ is exactly of the form
\[ (n_1, n_2, \dots, n_b) = \mathrm{Mult}(n - s_0 - bs_1, V_1, V_2, \dots, V_b) + (s_1, s_1, \dots, s_1) \tag{55} \]
for a random vector $(V_1, \dots, V_b)$; in particular, the vector $(V_1, \dots, V_b)$ cannot depend on $n$. Although many important data structures satisfy this property, some other more combinatorial examples do not; see, for instance, the case of increasing trees [5].
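The split rule (55) translates directly into a sampler. The sketch below is illustrative and makes its assumptions explicit: the split vector is passed in already sampled (the hypothetical argument `probs`), and the multinomial draw is done by assigning the $n - s_0 - bs_1$ free items to children one at a time.

```python
import random

def split_cardinalities(n, probs, s0, s1, rng):
    """Sample the child cardinalities (n_1, ..., n_b) of a node holding n
    items, following (55): Mult(n - s0 - b*s1, V_1..V_b) + (s1, ..., s1).

    probs is one realization of the split vector (V_1, ..., V_b).
    """
    b = len(probs)
    free = n - s0 - b * s1          # items distributed multinomially
    if free < 0:
        raise ValueError("node holds too few items to be split")
    counts = [s1] * b               # every child first receives s1 items
    for child in rng.choices(range(b), weights=probs, k=free):
        counts[child] += 1
    return counts

if __name__ == "__main__":
    rng = random.Random(0)
    v = rng.random()                # binary search tree style split (V, 1 - V)
    print(split_cardinalities(1000, [v, 1.0 - v], s0=1, s1=0, rng=rng))
```

By construction the children receive $n - s_0$ items in total and at least $s_1$ items each; the relaxed condition (56) discussed next only asks that the proportions $n_i/n$ converge in distribution, without prescribing this exact form.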

Also, the reader might have noticed that our proof does not quite use the full strength of the assumption in (55). Indeed, our proof mainly uses two facts: first, that the sequence of subtree sizes along a branch is well approximated by the product form $nL_u = n\prod_{v \prec u} V_v$, which, modulo some details about $C(V)$, implies that
\[ X = \sum_{k=1}^{b} V_k X^{(k)} + C(V); \]
and second, that the addition of some items to the tree only modifies $\mathbf{E}[\Psi(T)]$ moderately (see Lemma 4.6).

The two requirements are satisfied when the items are distributed in subtrees according to (55). We now indicate why our result would still hold under the much weaker condition that there exists a vector $V = (V_1, \dots, V_b)$ such that the cardinalities $n_1, \dots, n_b$ of the children of a node of cardinality $n$ satisfy
\[ \Bigl(\frac{n_1}{n}, \frac{n_2}{n}, \dots, \frac{n_b}{n}\Bigr) \to (V_1, V_2, \dots, V_b) \quad\text{in distribution} \tag{56} \]
as $n \to \infty$. Of course, the copies of the limit vectors $V$ at distinct nodes should be independent. The general shape of trees under this model has recently been worked out by Broutin et al. [8] (see also Drmota [15], who treats the model of increasing trees by Bergeron et al. [5] more directly).

One should be easily convinced that the relaxed condition in (56) should 
be sufficient for the result to hold: 

• Proposition 4.1 may be extended using the coupling arguments already 
used in [8] , proving that the contribution of the top of the tree to the path 
length may be estimated using renewal functions associated to the limit 
vector V. 

• Similarly, the extension of Proposition 4.2 relies on the same coupling argument (the overshoot there is still approximated by that of the limit vector). Here, it is important to note that the proof of smoothness of the path length (Lemma 4.6) requires the existence of a fixed function $g$ such that the size $|T^n|$ of a "generalized" split tree of cardinality $n$ satisfies $|T^n| \le g(n)$ with probability 1 (at least our proof does). This was already necessary for the results on the shape of the trees in [8] to hold. The constraint is not too strong, since it holds as soon as $s_0$ or $s_1$ is nonzero, and any function would do, regardless of its growth. (This is another reason why the case of digital trees should be treated separately: for such trees, the size of a tree containing two items can be arbitrarily large.)
• As already noted in Section 3, the part of the proof relative to the contrac- 
tion method in [45, 46] will go through as long as the coefficients C„(n) 
converge, and the expansion for mean implies their convergence. 